Expressive Oriented Time-Scale Adjustment for Mis-Played Musical Signals Based on Tempo Curve Estimations
Musical recordings by non-proficient (amateur) performers contain two types of tempo fluctuation: intended “tempo curves” and unintended “mis-played components” caused by poor control of the instrument. In this study, we propose a method for estimating the intended tempo fluctuations, called “true tempo curves,” from mis-played recordings. We also propose an automatic audio signal modification that uses time-scale modification with the estimated true tempo curve to remove the mis-played component. Onset timings are detected by an onset detection method based on the human auditory system. The true tempo curve is estimated by polynomial regression analysis using the detected onset timings and score information. The power spectrograms of the observed musical signals are then adjusted using the true tempo curve. A subjective evaluation tested the closeness of the rhythm: the mean opinion scores of the adjusted sounds were higher than those of the original recordings, with significant differences for all tested instruments.
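The regression step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of a degree-2 polynomial, and the use of per-interval BPM as the regression target are all assumptions made for illustration. The sketch takes score onset positions (in beats) and detected onset times (in seconds), fits a smooth polynomial to the local tempo, and derives a per-interval stretch ratio that a time-scale modification stage could apply.

```python
import numpy as np

def estimate_tempo_curve(score_beats, onset_times, degree=2):
    """Fit a smooth tempo curve (BPM vs. beat position) by polynomial regression.

    Hypothetical helper: the polynomial degree and BPM target are assumptions.
    """
    score_beats = np.asarray(score_beats, dtype=float)
    onset_times = np.asarray(onset_times, dtype=float)
    # Local tempo of each inter-onset interval, in beats per minute.
    local_bpm = 60.0 * np.diff(score_beats) / np.diff(onset_times)
    # Place each local tempo value at the midpoint of its interval.
    mid_beats = 0.5 * (score_beats[:-1] + score_beats[1:])
    coeffs = np.polyfit(mid_beats, local_bpm, degree)
    return np.poly1d(coeffs)

def stretch_ratios(score_beats, onset_times, curve):
    """Per-interval ratio (target duration / observed duration).

    A ratio above 1 means the interval should be lengthened, below 1 shortened.
    """
    score_beats = np.asarray(score_beats, dtype=float)
    onset_times = np.asarray(onset_times, dtype=float)
    mid_beats = 0.5 * (score_beats[:-1] + score_beats[1:])
    # Duration each interval would have if played exactly on the fitted curve.
    target_dur = 60.0 * np.diff(score_beats) / curve(mid_beats)
    observed_dur = np.diff(onset_times)
    return target_dur / observed_dur

if __name__ == "__main__":
    beats = np.arange(9)  # nine quarter-note onsets from the score
    jitter = np.array([0, .03, -.03, .03, -.03, .03, -.03, .03, 0])
    times = 0.5 * beats + jitter  # nominal 120 BPM with timing errors
    curve = estimate_tempo_curve(beats, times)
    print(stretch_ratios(beats, times, curve))
```

Fitting a low-degree polynomial is what separates the two components in this sketch: slow, deliberate tempo changes are captured by the curve, while note-to-note timing errors end up in the residual and are removed by the stretch ratios.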
Modeling and Rendering for Virtual Dropping Sound based on Physical Model of Rigid Body
This paper describes sound production by means of a physical model of falling objects, intended for audio synthesis in immersive content. Our approach uses a mathematical model to synthesize sound for animation driven by rigid body simulation. To cover various conditions, a collision model of the object was introduced for vibration and propagation simulation. The generated sound was evaluated by comparing the model output with real sound using numerical criteria and psychoacoustic analysis. Experiments were performed for a variety of objects and floor surfaces, approximately 90% of which were similar to real scenarios. The usefulness of the physical model for audio synthesis in virtual reality was demonstrated in terms of the breadth and quality of the sound.
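The abstract does not give the model's equations, so the following is only a rough sketch of one common simplification, modal synthesis, and not the method of the paper. It assumes the struck object can be approximated by a few exponentially decaying sinusoidal modes whose excitation scales with the impact velocity of a free fall from a given height. The mode frequencies, decay rates, and gains are hypothetical values chosen for illustration.

```python
import numpy as np

def impact_velocity(height_m, g=9.81):
    """Speed at impact for a free fall from height_m, ignoring air drag."""
    return np.sqrt(2.0 * g * height_m)

def modal_impact(freqs_hz, decays, gains, velocity, sr=44100, dur=0.5):
    """Sum of damped sinusoids scaled by impact velocity.

    Assumption: a linear modal model stands in for the vibration simulation;
    freqs_hz, decays (1/s), and gains are illustrative, not measured, values.
    """
    t = np.arange(int(sr * dur)) / sr
    out = np.zeros_like(t)
    for f, d, a in zip(freqs_hz, decays, gains):
        out += a * np.exp(-d * t) * np.sin(2.0 * np.pi * f * t)
    return velocity * out

if __name__ == "__main__":
    v = impact_velocity(1.0)  # object dropped from 1 m
    signal = modal_impact(
        freqs_hz=[420.0, 1130.0, 2750.0],  # hypothetical mode frequencies
        decays=[18.0, 35.0, 60.0],         # hypothetical damping per mode
        gains=[1.0, 0.6, 0.3],             # hypothetical mode amplitudes
        velocity=v,
    )
    print(len(signal), float(np.max(np.abs(signal))))
```

In a rigid body animation pipeline, the velocity argument would come from the simulator's collision events rather than from a fixed drop height, and different floor surfaces could be approximated by changing the decay rates.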